Goto

Collaborating Authors

 causal strength





Causality-Induced Positional Encoding for Transformer-Based Representation Learning of Non-Sequential Features

Xu, Kaichen, Du, Yihang, Liu, Mianpeng, Yu, Zimu, Sun, Xiaobo

arXiv.org Artificial Intelligence

Positional encoding is essential for supplementing transformer with positional information of tokens. Existing positional encoding methods demand predefined token/feature order, rendering them unsuitable for real-world data with non-sequential yet causally-related features. To address this limitation, we propose CAPE, a novel method that identifies underlying causal structure over non-sequential features as a weighted directed acyclic graph (DAG) using generalized structural equation modeling. The DAG is then embedded in hyperbolic space where its geometric structure is well-preserved using a hyperboloid model-based approach that effectively captures two important causal graph properties (causal strength & causal specificity). This step yields causality-aware positional encodings for the features, which are converted into their rotary form for integrating with transformer's self-attention mechanism. Theoretical analysis reveals that CAPE-generated rotary positional encodings possess three valuable properties for enhanced self-attention, including causal distance-induced attenuation, causal generality-induced attenuation, and robustness to positional disturbances. We evaluate CAPE over both synthetic and real-word datasets, empirically demonstrating its theoretical properties and effectiveness in enhancing transformer for data with non-sequential features. Our code is available at https://github.com/Catchxu/CAPE.



Causal Explainability of Machine Learning in Heart Failure Prediction from Electronic Health Records

Hou, Yina, Rabbani, Shourav B., Hong, Liang, Diawara, Norou, Samad, Manar D.

arXiv.org Machine Learning

The importance of clinical variables in the prognosis of the disease is explained using statistical correlation or machine learning (ML). However, the predictive importance of these variables may not represent their causal relationships with diseases. This paper uses clinical variables from a heart failure (HF) patient cohort to investigate the causal explainability of important variables obtained in statistical and ML contexts. Due to inherent regression modeling, popular causal discovery methods strictly assume that the cause and effect variables are numerical and continuous. This paper proposes a new computational framework to enable causal structure discovery (CSD) and score the causal strength of mixed-type (categorical, numerical, binary) clinical variables for binary disease outcomes. In HF classification, we investigate the association between the importance rank order of three feature types: correlated features, features important for ML predictions, and causal features. Our results demonstrate that CSD modeling for nonlinear causal relationships is more meaningful than its linear counterparts. Feature importance obtained from nonlinear classifiers (e.g., gradient-boosting trees) strongly correlates with the causal strength of variables without differentiating cause and effect variables. Correlated variables can be causal for HF, but they are rarely identified as effect variables. These results can be used to add the causal explanation of variables important for ML-based prediction modeling.


SpaceTime: Causal Discovery from Non-Stationary Time Series

Mameche, Sarah, Cornanguer, Lénaïg, Ninad, Urmi, Vreeken, Jilles

arXiv.org Artificial Intelligence

Understanding causality is challenging and often complicated by changing causal relationships over time and across environments. Climate patterns, for example, shift over time with recurring seasonal trends, while also depending on geographical characteristics such as ecosystem variability. Existing methods for discovering causal graphs from time series either assume stationarity, do not permit both temporal and spatial distribution changes, or are unaware of locations with the same causal relationships. In this work, we therefore unify the three tasks of causal graph discovery in the non-stationary multi-context setting, of reconstructing temporal regimes, and of partitioning datasets and time intervals into those where invariant causal relationships hold. To construct a consistent score that forms the basis of our method, we employ the Minimum Description Length principle. Our resulting algorithm SPACETIME simultaneously accounts for heterogeneity across space and non-stationarity over time. Given multiple time series, it discovers regime changepoints and a temporal causal graph using non-parametric functional modeling and kernelized discrepancy testing. We also show that our method provides insights into real-world phenomena such as river-runoff measured at different catchments and biosphere-atmosphere interactions across ecosystems.


Root Cause Attribution of Delivery Risks via Causal Discovery with Reinforcement Learning

Bo, Shi, Xiao, Minheng

arXiv.org Artificial Intelligence

This paper presents a novel approach to root cause attribution of delivery risks within supply chains by integrating causal discovery with reinforcement learning. As supply chains become increasingly complex, traditional methods of root cause analysis struggle to capture the intricate interrelationships between various factors, often leading to spurious correlations and suboptimal decision-making. Our approach addresses these challenges by leveraging causal discovery to identify the true causal relationships between operational variables, and reinforcement learning to iteratively refine the causal graph. This method enables the accurate identification of key drivers of late deliveries, such as shipping mode and delivery status, and provides actionable insights for optimizing supply chain performance. We apply our approach to a real-world supply chain dataset, demonstrating its effectiveness in uncovering the underlying causes of delivery delays and offering strategies for mitigating these risks. The findings have significant implications for improving operational efficiency, customer satisfaction, and overall profitability within supply chains.


Deciphering interventional dynamical causality from non-intervention systems

Shi, Jifan, Li, Yang, Zhao, Juan, Leng, Siyang, Aihara, Kazuyuki, Chen, Luonan, Lin, Wei

arXiv.org Machine Learning

Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational criterion, Interventional Embedding Entropy (IEE), to quantify causality. The IEE criterion theoretically and numerically enables the deciphering of IntDC solely from observational (non-interventional) time-series data, without requiring any knowledge of dynamical models or real interventions in the considered system. Demonstrations of performance showed the accuracy and robustness of IEE on benchmark simulated systems as well as real-world systems, including the neural connectomes of C. elegans, COVID-19 transmission networks in Japan, and regulatory networks surrounding key circadian genes.


CausalScore: An Automatic Reference-Free Metric for Assessing Response Relevance in Open-Domain Dialogue Systems

Feng, Tao, Qu, Lizhen, Kang, Xiaoxi, Haffari, Gholamreza

arXiv.org Artificial Intelligence

Automatically evaluating the quality of responses in open-domain dialogue systems is a challenging but crucial task. Current evaluation metrics often fail to align with human judgments, especially when assessing responses that are grammatically correct. To address this issue, we propose a novel metric, called CausalScore, which assesses the relevance of responses by measuring the causal strength between dialogue histories and responses. The causal strength is estimated by utilizing both unconditional dependence and conditional dependencies from the dialogue history to responses. We compare our metric with the existing competitive metrics in terms of their alignment with human judgements. Our experimental results demonstrate that CausalScore significantly surpasses existing state-of-the-art metrics by aligning better with human judgements. Additionally, we collect a new dialogue dataset CGDIALOG+ with human-annotated causal relations and a set of pairwise human judgements to facilitate the development of future automatic metrics.